The aim of this work was to measure subjective speech intelligibility in an enclosure with a long reverberation time and to compare the results with objective parameters. Impulse responses (IRs) were first determined with a dummy head at different measurement points of the enclosure. The following objective parameters were calculated with Dirac 4.1 software: Reverberation Time (RT), Early Decay Time (EDT), weighted Clarity (C50) and Speech Transmission Index (STI). For the chosen measurement points, the IRs were convolved with the Polish Sentence Test (PST) and the logatome tests. The PST was presented against a background of babble noise, and the speech reception threshold, SRT (i.e. the SNR yielding 50% speech intelligibility), was evaluated for those points. The relationship of sentence and logatome recognition vs. STI was determined. It was found that the final SRT data are well correlated with the STI and can be described by a psychometric function. The difference between the SRT determined without reverberation and in reverberant conditions appeared to be a good measure of the effect of reverberation on speech intelligibility in a room. In addition, speech intelligibility with and without the sound amplification system installed in the enclosure was compared.
transmission index.


1. Introduction 


Speech recognition performance (e.g. in terms of intelligibility) depends on many objective conditions such as the acoustic parameters of an enclosure, the signal-to-noise ratio (SNR), the spectro-temporal properties of the interfering noise, etc. The relationship between speech intelligibility and the acoustic parameters of an enclosure has been studied by many authors (ASTOLFI et al., 2012; BRACHMANSKI, 2004; 2008; BRADLEY, 1986a; BRADLEY et al., 2003; HOUTGAST, STEENEKEN, 1985; HOUTGAST et al., 1980; JACOB et al., 1991; KANG, 1998; PENG et al., 2011; 2015; STEENEKEN, HOUTGAST, 1980; YANG, BRADLEY, 2009); however, it is still a challenging topic. Intelligibility in a room mainly depends on the reverberant conditions, which in turn depend on the listener's position in the room (YANG, BRADLEY, 2009). Reverberation affects speech intelligibility through masking: reflected sounds that reach the listener later mask the direct speech signal (BRADLEY et al., 2003). Some studies demonstrated improvements in speech intelligibility due to early reflections (BRADLEY, 1986a; 1986b; 1998; BRANDEWIE, ZAHORIK, 2010; HARVIE-CLARK et al., 2014). Early reflection energy in real rooms is equivalent to increasing the level of the direct sound by up to 9 dB (BRADLEY et al., 2003; YANG, BRADLEY, 2009). Later-arriving speech sounds are usually found to be detrimental to the intelligibility of speech (BRADLEY, 1998). Several measures of early reflections are in use, such as the EDT, obtained from the first 10 dB of decay (Standardization, 1998), and the energy index Clarity (C50), defined as the ratio of early (from 0 to 50 ms) to late (after 50 ms) energy (BRADLEY, 1983; 1990; MARSHALL, 1994). The relative strength of the early reflections is expressed by various measures and their interdependence. These include the rise time, early decay times, various ratios of early- and late-arriving sounds (BRADLEY, 1983), and the Speech Transmission Index (STI) (HOUTGAST et al., 1980; STEENEKEN, HOUTGAST, 1980). This index is a general measure of speech transmission quality and is often used to evaluate the influence of reverberation on speech intelligibility. This influence results from the fact that reverberation decreases the envelope fluctuations in speech. A decrease in STI causes a reduction in sentence intelligibility (HOUTGAST, STEENEKEN, 1984).
Evaluation of speech intelligibility is mainly based on one-syllable word rhyme tests, words, logatomes or simple sentence tests (BRACHMANSKI, 2008; HAGERMAN, 1982; KALIKOW et al., 1977; KOLLMEIER, WESSELKAMP, 1997a; NILSSON et al., 1994; OZIMEK et al., 2009a; 2006; PENG et al., 2011; 2015; PLOMP, MIMPEN, 1979a; VERSFELD et al., 2000). Results of word rhyme tests (e.g. BRADLEY, 1986b; 1990; BRADLEY et al., 2003; PRODI et al., 2010) or logatome tests (BRACHMANSKI, 2004; 2008; VAN WIJNGAARDEN, DRULLMAN, 2008) correspond well to the STI values and thus can be used in real acoustical conditions (HOUTGAST, STEENEKEN, 1985). These tests give results which are monotonic functions of STI over the whole range of its values.

An interesting suggestion is to use a measure of intelligibility expressed as the Speech Reception Threshold (SRT), defined as the SNR corresponding to 50% speech intelligibility. This measure is more phonemically representative for a given language and has been shown to give more accurate speech intelligibility data than standard word tests. This is due to the relatively steep slope of the intelligibility function at the SRT point, i.e. S50. It was shown that the larger the S50, the smaller the spread of data (standard deviation) at the SRT, i.e. the more accurate the speech intelligibility measurement (KOLLMEIER, WESSELKAMP, 1997b; NILSSON et al., 1994; OZIMEK et al., 2009b; 2010; PLOMP, MIMPEN, 1979b; WAGENER, 2003). An adaptive procedure with the 1-up/1-down decision rule (LEVITT, 1971) can be used to determine the SRT. To apply the adaptive procedure to reverberant conditions, speech intelligibility tests can be carried out either in situ or in the laboratory. The former requires gathering the subjects in the room and presenting the tests to them. This method has some disadvantages, especially related to logistics and time consumption. The latter solution seems more convenient, since it can be carried out at any time in the laboratory just by recording the signals at chosen places of the enclosure via a dummy head and presenting the recordings in the laboratory. The most flexible way, however, is to record IRs via a dummy head instead. The recorded IRs can be used for so-called auralization, which is in fact their convolution with the test material. The listening session can then also be carried out in the laboratory (ARAI et al., 2002; BRANDEWIE, ZAHORIK, 2010; CULLING, LAVANDIER, 2009; JORGENSEN et al., 1991; LONGWORTH-REED et al., 2008; PENG, 2007; 2008; PENG et al., 2011; YANG, 2006). It is worth noting that the SRT was recently used by GEORGE et al. (2010) to measure the effects of reverberation and noise on sentence intelligibility for hearing-impaired subjects.

The main purpose of the current study is to assess speech intelligibility in normal-hearing subjects by measuring the SRT in an enclosure with a long reverberation time. The enclosure was a church characterized by place-dependent acoustic parameters. Two different sound sources were used in the study, namely an omnidirectional loudspeaker placed at the altar and the sound amplification system installed in the enclosure. The relationship of sentence and logatome recognition vs. STI was determined. Speech intelligibility with and without the sound amplification system installed in the enclosure was compared. The experimental data showed that the reverberant listening environment was well reflected in the SRT data, which were correlated with the speech transmission index (STI).


2. Method 
2.1. Experimental set-up 


A PC with B&K Dirac 4.1 software was used to record and collect the impulse responses (IRs) of the enclosure. The software also allows calculation of the following objective parameters of the enclosure: RT, EDT, C50 and STI. To extract an IR, a Maximum Length Sequence (MLS) (BORISH, ANGELL, 1983; CHU, 1990; KUTTRUFF, 2009) was used as the driving signal instead of an impulse burst. Two different types of receivers were used: an omnidirectional microphone (Svantek SV01A) and a dummy head (Neumann KU100). The former was used to obtain the objective parameters, while the latter was used to collect the IRs via a head (with an HRTF); these IRs were subsequently convolved with the PST in the laboratory to measure speech intelligibility. The MLS signal was generated by the software and fed via a D/A converter (ESI U2A) to the amplifier and then to the omnidirectional source placed at the altar or to the loudspeakers placed on the walls. All the recordings were carried out at 15 different places in the church to map the acoustical properties of the building.

Fig. 1. A sketch of the tested church with the omnidirectional source (circle), sound amplification system loudspeakers (diamonds) and 15 measurement points (squares).
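The MLS principle described above can be illustrated with a minimal sketch. It is not the Dirac 4.1 implementation: the sequence order `nbits`, the toy impulse response and the helper name `mls_impulse_response` are assumptions made only for illustration. The sketch relies on the fact that the circular autocorrelation of a ±1 MLS is close to a unit impulse, so cross-correlating the steady-state room response with the excitation recovers one period of the IR.

```python
import numpy as np
from scipy.signal import max_len_seq


def mls_impulse_response(recorded, mls_pm1):
    """Estimate one period of the IR by circular cross-correlation with the MLS."""
    n = len(mls_pm1)
    # Circular cross-correlation computed in the frequency domain.
    spectrum = np.fft.fft(recorded[:n]) * np.conj(np.fft.fft(mls_pm1))
    return np.real(np.fft.ifft(spectrum)) / (n + 1)


# Toy example (hypothetical values, not the church measurements).
nbits = 10
seq, _ = max_len_seq(nbits)      # 0/1 sequence of length 2**nbits - 1
mls = 2.0 * seq - 1.0            # map to +/-1

true_ir = np.zeros(len(mls))
true_ir[0], true_ir[40], true_ir[200] = 1.0, 0.5, 0.25

# Steady-state response to the periodic MLS = circular convolution with the IR.
recorded = np.real(np.fft.ifft(np.fft.fft(mls) * np.fft.fft(true_ir)))

est_ir = mls_impulse_response(recorded, mls)
print(np.max(np.abs(est_ir - true_ir)))  # small residual offset of order sum(h)/(N+1)
```

The sequence period has to be longer than the room decay; with an RT of about 4 s, a real measurement would need a far longer sequence than the one used in this toy example.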

Moreover, two different sound sources were used in the study: an omnidirectional loudspeaker placed at the altar and the sound amplification system installed in the church. A comparison of the speech intelligibility measured for the two sources may give insight into the improvement provided by the sound amplification system installed in this large-volume enclosure.

A comparison of the acoustic parameter values at different places of the enclosure showed a high consistency between symmetrical measurement points, namely 2 and 10, 3 and 11, 4 and 12, 5 and 13, 8 and 14, 9 and 15. Thus, in further analysis only the data for one side of the enclosure (9 measurement points) were taken into account.


2.2. Recognition test and listening sessions 


The Polish Sentence Test (PST) presented against a masking noise (babble) was used in the present study (for details see OZIMEK et al., 2006). The babble noise, made from a mixture of all the sentences used in the test, served as the masker (for details see OZIMEK et al., 2009b). The power spectrum of the babble noise closely matched the power spectra of the sentences. Precise spectral matching of the masked speech and the masker has been shown to be very important for obtaining a steep intelligibility function, i.e. for accurate SRT measurement. Thus, any statistically significant change in the SRT may be regarded as a measure of the effect of an external parameter (in our case, reverberation). The PST is composed of 25 lists, each containing 10 sentences. The lists have been phonemically and statistically balanced. It was found that in anechoic conditions the mean SRT (i.e. the SNR yielding 50% speech intelligibility) was equal to -6.1 dB. This value, obtained in anechoic conditions, was used in the further study as a reference value. Due to the relatively steep slope of its psychometric functions, the sentence test was shown to be accurate material for speech intelligibility measurements. Additionally, the Polish Logatome Test (PLT) was used (BRACHMANSKI, STARONIEWICZ, 1999). Logatomes (nonsense words) are usually used to assess the distortions introduced by the path the signal has to pass through (electrical, acoustical, etc.). These tests are based on the assumption that all the phonemes of a logatome must be heard correctly for the logatome to be repeated. Thus, this kind of test is very robust; however, it does not reflect a real communication process as sentences do.

The so-called auralization was used, i.e. the IRs recorded via the dummy head at all the measurement points were convolved with the intelligibility test material on a computer and presented to the subjects via a Tucker-Davis Technologies (TDT) RP2 (D/A converter) and Sennheiser HD580 headphones. The listening sessions were controlled using Matlab 6.5 software. The SNR was modified adaptively, taking into consideration the most recent response of a subject. If the response was correct, the next sentence was presented at a lower SNR. Conversely, if the response was incorrect, the SNR of the next utterance was increased. During the measurement, the SNR converged to the 50% equilibrium point on the intelligibility function. The SRT was computed as the mean of the adaptively changed SNR values, excluding several initial values (OZIMEK et al., 2009b; PLOMP, MIMPEN, 1979b), or derived by fitting the model function to scores calculated for SNRs from the adaptive measurement, including the initial values (VERSFELD et al., 2000).
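The auralization step can be sketched as follows. This is only an illustration under stated assumptions, not the Matlab code used in the study: the function names `auralize` and `mix_at_snr` are hypothetical, the signals are random placeholders, and the sketch assumes that the masker is added to the already convolved target, which the text does not specify.

```python
import numpy as np
from scipy.signal import fftconvolve


def auralize(speech, brir_left, brir_right):
    """Convolve a mono test signal with a binaural IR pair; returns shape (n, 2)."""
    left = fftconvolve(speech, brir_left)
    right = fftconvolve(speech, brir_right)
    return np.stack([left, right], axis=1)


def mix_at_snr(target, masker, snr_db):
    """Scale the masker so that 10*log10(P_target / P_masker) equals snr_db."""
    p_target = np.mean(target ** 2)
    p_masker = np.mean(masker ** 2)
    gain = np.sqrt(p_target / (p_masker * 10.0 ** (snr_db / 10.0)))
    return target + gain * masker, gain


# Placeholder signals (hypothetical; real use would load a sentence and measured IRs).
rng = np.random.default_rng(0)
speech = rng.standard_normal(1000)
decay = np.exp(-np.arange(200) / 50.0)
brir_l = rng.standard_normal(200) * decay
brir_r = rng.standard_normal(200) * decay

binaural = auralize(speech, brir_l, brir_r)
masker = rng.standard_normal(binaural.shape[0])
mixed, gain = mix_at_snr(binaural[:, 0], masker, snr_db=-6.0)

achieved = 10.0 * np.log10(np.mean(binaural[:, 0] ** 2) / np.mean((gain * masker) ** 2))
print(binaural.shape, round(achieved, 6))
```

In an adaptive run, `snr_db` would be the value set by the tracking rule for the current sentence.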

Twenty normal-hearing subjects took part in the listening sessions. Their age ranged from 23 to 28 years. They reported no problems with hearing or with speech reception. They had pure-tone hearing thresholds better than 10 dB HL at octave frequencies between 0.25 and 4.0 kHz. All of them listened to the test convolved with the IRs recorded at each measurement point, for both the omnidirectional source and the sound amplification system. A particular list of the test was listened to only once to avoid a learning effect. Short training sessions were carried out to acquaint the listeners with the task. The subjects were asked to write down the presented sentence. The subject was seated in a double-walled acoustically insulated booth. So-called binary scoring was used in the assessment of speech intelligibility, namely only a correctly written logatome/sentence was counted as correctly understood, and any mistake (except spelling mistakes) was scored as incorrect. The total presentation level of the target signal at a particular point was equal to the level measured in the enclosure during the recordings, thus all the in situ conditions were preserved.


3. Results 
3.1. Objective parameters 


3.1.1. RT/EDT 


Standard RTs/EDTs in six octave bands, computed with Dirac 4.1 software, and their mean values for nine measurement points are given in Table 1, and the mean RT/EDT values are presented in Fig. 2.

Table 1. RT/EDT values [s] for six octave-bands and for nine measurement points.


Fig. 2. Mean RTs and EDTs [s] for nine measurement points.


Analysis of the RTs in different parts of the enclosure shows that the RT varies from 3.6 s (point 2) to 4.2 s (point 7). The overall mean RT is about 4.0 s. The EDT values are slightly longer than the RT values (except at points 1 and 2). This suggests that the influence of reverberation on speech intelligibility might also be somewhat stronger, since the early part of the decay has a greater influence than was suggested by the RT; the intelligibility might be slightly higher.


3.1.2. Weighted Clarity (C50) 


Since speech intelligibility in an enclosure with a long RT is mainly related to the early reflection part of the sound energy (the early part of an IR), the C50 parameter was also calculated. Figure 3 shows C50 values versus measurement point for the omnidirectional source and the sound amplification system. The prediction of speech intelligibility according to Marshall's rating (1994) is indicated by dashed lines. As C50 is an energy ratio in time, the placement of the sound source as well as the distance between the source and the measurement point are crucial.

Fig. 3. Weighted C50 values [dB] for nine measurement points, for omnidirectional source and sound amplification system.
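The early-to-late energy ratio behind C50 can be computed directly from an impulse response, as in the following sketch. It is a broadband illustration only: the octave-band weighting used for the weighted C50 is not reproduced, the onset detection by the strongest peak is an assumption, and the exponentially decaying test signal is synthetic rather than a measured IR.

```python
import numpy as np


def clarity_c50(ir, fs, early_ms=50.0):
    """Broadband C50 in dB: energy in the first 50 ms after onset vs. the rest."""
    ir = np.asarray(ir, dtype=float)
    onset = int(np.argmax(np.abs(ir)))            # assumption: direct sound is the peak
    split = onset + int(round(early_ms * 1e-3 * fs))
    early = np.sum(ir[onset:split] ** 2)
    late = np.sum(ir[split:] ** 2)
    return 10.0 * np.log10(early / late)


# Synthetic, purely exponential decay with RT = 4 s (hypothetical test signal).
fs = 8000
rt = 4.0
t = np.arange(int(8.0 * fs)) / fs
ir = np.exp(-np.log(1000.0) * t / rt)             # amplitude falls by 60 dB after rt seconds

c50 = clarity_c50(ir, fs)
print(round(c50, 2))
```

For an ideal exponential decay the result depends only on the RT, which is why a 4 s reverberation time alone already implies a strongly negative C50 when there is no dominant direct sound.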

C50 data suggest that the use of the sound amplification system increases speech intelligibility, especially at places located at the end of the church, where the influence of early energy is minimal. In such a situation the increase in C50 caused by the sound amplification system leads to fair speech intelligibility (a rise of two categories). Also at point 9, where there was almost no direct sound because of the columns of the arch between the target source and the measurement point, the speech intelligibility is bad for the omnidirectional source and fair for the sound amplification system. At the other measurement points the C50 values also suggest an increase in speech intelligibility, but only by one category (from poor to fair).
3.1.3. Speech Transmission Index, STI


STI values calculated for both the omnidirectional source and the sound amplification system at the different measurement points are depicted in Fig. 4.

Fig. 4. Speech intelligibility prediction based on the STI values for omnidirectional source and sound amplification system for nine measurement points.


STI data are generally in agreement with the C50 prediction. However, some differences can be found for points 5-8, where poor instead of bad speech intelligibility was predicted for the omnidirectional source. The use of the sound amplification system raises the speech intelligibility prediction to fair.


3.2. Speech recognition estimation 


3.2.1. Logatome recognition


Figure 5 depicts the mean logatome recognition for the omnidirectional source and for the sound amplification system. Additionally, results from the anechoic chamber are depicted with an asterisk as a reference value. As can be seen, logatome recognition is generally lower for the omnidirectional source than for the sound amplification system. Thus, the general statement that the sound amplification system makes speech intelligibility much higher is confirmed. Since a two-way ANOVA showed that both the way of presentation (omnidirectional source vs. sound system) and the measurement point are statistically significant, [F = 407, p < 0.001] and [F = 3, p = 0.007], respectively, the results were divided into two groups according to the way of presentation. In each group a further ANOVA was carried out to investigate whether the measurement point is statistically significant. For the omnidirectional source the measurement point proved to be statistically significant [F = 7, p < 0.001]; for the sound system, however, the effect of measurement point is on the border of significance [F = 2, p = 0.07], thus one can state that the sound system equalizes the conditions in the enclosure. Nevertheless, the results of the ANOVA suggest that the measurement points should be analyzed separately, without any averaging.

Fig. 5. Mean logatome recognition [%], averaged across twenty subjects, versus nine measurement points.


3.2.2. Sentence intelligibility based on SRT 


A new approach to measuring speech intelligibility in a room was undertaken, consisting in measurement of the speech reception threshold (SRT) based on the Polish Sentence Test (PST).

First, the PST was recorded in an anechoic chamber with the dummy head placed in front of the signal source, and the reference SRT values were measured. A standard 1-up/1-down adaptive procedure (LEVITT, 1971) was used to determine the SRT values. In this procedure, the SNR was varied adaptively according to the subject's most recent response: it was increased by some value (step) when the most recent response was incorrect (1-up) and decreased when it was correct (1-down). The SRT was determined as the mean of the last 8 (out of 13) nominal SNRs. The mean SRT (across 20 subjects) obtained in the anechoic chamber was equal to -6.5 dB and is shown in Fig. 7 by the asterisk. In the next stage the PST was convolved with the IRs recorded via the dummy head in the tested enclosure. Subsequently, sentence intelligibility was measured for this condition (with reverberation). Figure 6 depicts the mean SRTs obtained in the reverberant condition for the omnidirectional source (open triangles) and the sound amplification system (black circles). A large increase in intelligibility (represented by a decrease in SRT values) can be observed when the sound amplification system was used. It can also be seen that reverberation causes a higher SD, which suggests that the ability to understand sentences in reverberant conditions is more subject-dependent. Moreover, the SRT values for the measurement points closer to the source are lower than those from the back of the church. This seems obvious, since for the measurement points that are closer to the source the early energy is higher and helps the listener understand speech, while for the most distant points the late energy is higher, causing a deterioration in speech intelligibility (an increase in SRTs). These findings are in line with the STI results (see Fig. 4 for details).

Fig. 6. Mean SRTs, averaged across twenty listeners, versus nine measurement points.
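The 1-up/1-down rule and the averaging of the last 8 of 13 presented SNRs can be sketched as below. The step size of 2 dB, the starting SNR of 0 dB, the function name `run_staircase` and the deterministic "listener" are assumptions chosen for illustration; the text does not state the step size or the starting level.

```python
def run_staircase(respond, start_snr_db=0.0, step_db=2.0, n_trials=13, n_avg=8):
    """Run a 1-up/1-down track; return the presented SNRs and the SRT estimate."""
    snr = start_snr_db
    track = []
    for _ in range(n_trials):
        track.append(snr)
        if respond(snr):
            snr -= step_db   # correct response: lower the SNR (1-down)
        else:
            snr += step_db   # incorrect response: raise the SNR (1-up)
    srt = sum(track[-n_avg:]) / n_avg
    return track, srt


# Hypothetical listener who repeats the sentence correctly whenever SNR >= -6.5 dB.
track, srt = run_staircase(lambda snr_db: snr_db >= -6.5)
print(track)
print(srt)
```

With this idealized listener the track descends from 0 dB and then alternates between -6 and -8 dB, so the mean of the last eight values is -7 dB. A real listener responds probabilistically, which is why the SRT is averaged over several reversals and across subjects.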

The same statistical analysis as for logatome recognition was carried out. A two-way ANOVA showed that the way of presentation is statistically significant [F = 618, p < 0.001], as is the measurement point [F = 5.5, p < 0.001]; thus the results were divided into two groups and another ANOVA was carried out. For both the omnidirectional source and the sound system, the measurement point proved to be statistically significant, [F = 13, p < 0.001] and [F = 3.5, p < 0.001], respectively.


4. Discussion 


First we discuss the influence of late energy on the logatome intelligibility and SRT. As can be seen from Fig. 3, for the omnidirectional source the weighted C50 decreases drastically for points 5-9, which are far from the source placed at the altar. The use of the sound system brings the sound source much closer, thus the early energy is higher. As a consequence, the C50 value increases and the speech intelligibility should be higher. Comparing Fig. 3 and Fig. 5, however, it cannot be seen that logatome recognition decreases for the distant points (5-9) as predicted by the C50 values. This may be caused by the low logatome intelligibility even for points which are close to the source. In such a situation the logatome test seems not to be sensitive enough. Moreover, comparing Fig. 3 with Fig. 6 (STI values for different points) it may be stated that for distant points the SRT values are higher (leading to lower speech intelligibility) than for close points, which suggests that the speech intelligibility differs between those points. These results suggest that the SRT is more sensitive to changes in early/late energy ratios. Nonetheless, when the signal came from the sound system, both measures of speech intelligibility show an improvement, which is in line with the C50 measure.

With regard to STI, as shown by HOUTGAST and STEENEKEN (2002), the results of logatome recognition are well correlated with the STI measure. Those results were also confirmed by BRACHMANSKI (2004; 2008). It can be assumed, according to the results shown by HOUTGAST and STEENEKEN (2002) and BRACHMANSKI (2004; 2008), that for the narrow range of STI values obtained here a linear function is sufficient to model the relationship between logatome intelligibility and STI (see Fig. 7). The model was as follows:

$$\mathrm{LI}(\mathrm{STI}) = A \cdot \mathrm{STI} + B, \qquad (1)$$

where LI is the logatome intelligibility, and A and B are the parameters to be estimated.

For clarity, the two groups in Fig. 7 are depicted using different symbols; however, they were analyzed together.


Fig. 7. Linear fit to the logatome recognition vs. STI function for both sources.


The fitting coefficient R² = 0.8 is high, and the parameters are as follows: A = 134.5, B = -14.3 in this range of STI values. It must be emphasized that over a wide range of STI values the relationship is non-linear (BRACHMANSKI, 2004; 2008; HOUTGAST, STEENEKEN, 2002).
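As a small worked example of the linear model, the sketch below evaluates LI = A · STI + B with the reported parameters and shows how such a line and its R² could be obtained by least squares. The STI values in `sti_demo` are hypothetical and the points lie exactly on the reported line, so they are not the study's data; the helper names are likewise assumptions.

```python
import numpy as np

A, B = 134.5, -14.3  # parameters of the linear fit reported in the text


def logatome_intelligibility(sti):
    """Predicted logatome intelligibility (%) from the linear model."""
    return A * np.asarray(sti, dtype=float) + B


def fit_line(sti, li):
    """Least-squares line through (sti, li); returns slope, intercept and R^2."""
    sti = np.asarray(sti, dtype=float)
    li = np.asarray(li, dtype=float)
    slope, intercept = np.polyfit(sti, li, 1)
    residuals = li - (slope * sti + intercept)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((li - li.mean()) ** 2)
    return slope, intercept, r2


# Hypothetical STI values placed exactly on the reported line (not measured data).
sti_demo = np.array([0.30, 0.40, 0.50, 0.60])
li_demo = logatome_intelligibility(sti_demo)

a_hat, b_hat, r2 = fit_line(sti_demo, li_demo)
print(li_demo)
print(a_hat, b_hat, r2)
```

For instance, the model predicts about 39.5% logatome intelligibility at STI = 0.4. The model should not be extrapolated beyond the narrow STI range for which it was fitted.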

Nevertheless, the obtained results suggest a smaller slope than that suggested by HOUTGAST and STEENEKEN (2002). This slight difference may be caused by the Polish language used here, which, in contrast to English, belongs to the group of languages rich in fricatives, which are more vulnerable to distortions. The results of BRACHMANSKI (2004; 2008), who also used the Polish logatome test, confirm the data gathered here: they show a slightly higher slope, but one significantly lower than that suggested by HOUTGAST and STEENEKEN. Moreover, HOUTGAST and STEENEKEN suggest that for good conditions (high STI) logatome intelligibility approaches almost 100%; however, this was not confirmed here: even in an anechoic chamber, the logatome intelligibility reaches just about 80%. The same results were obtained by BRACHMANSKI.

The ANOVA results for measurement points can also be analyzed in terms of STI (which characterizes each measurement point). It can thus be stated that there is no significant difference between the intelligibility scores for different measurement points when the sound system was used [F = 2, p = 0.07]. This might be a result of the narrow range of intelligibility scores and STI values obtained for the sound system. For the omnidirectional source, however, the effect of STI was statistically significant [F = 7, p < 0.001]. Thus, the ANOVA results indicate that the use of the sound system equalizes the intelligibility among all points of the enclosure.

It is also worth mentioning that the logatome test does not reflect a real communication process, so it is still not an optimal solution for speech intelligibility testing. The sentence test and the SRT (i.e. the SNR yielding 50% speech intelligibility) seem to be a more suitable measure here, as they reflect the effect of distortions on real sentences and, what is more, the SRT is more sensitive to any change in conditions than classical speech intelligibility measured in percent. Therefore, the same analysis procedure as for the logatomes was applied to the SRT measure. According to previous SRT findings (OZIMEK et al., 2009a; 2006), in such a range of SRT changes a typical psychometric relation modeled by the logistic function can be applied to describe the SRT vs. STI relationship for both the omnidirectional source and the sound system at once (Fig. 8). Again for clarity, the two ways of presentation are depicted using different symbols; however, one curve was fitted to all points. This function can be expressed by Eq. (2):


$$\mathrm{SRT}(\mathrm{STI}) = A_2 + \frac{A_1 - A_2}{1 + (\mathrm{STI}/x_0)^{p}}, \qquad (2)$$

where A1, A2, x0 and p are the parameters to be estimated.

The fitting coefficient R² = 0.98 is very high, which suggests that the relationship is of a psychometric type in the analyzed STI range. The parameters of the curve are as follows: A1 = 6.32, A2 = -5.79, x0 = 0.44, p = 16.68.
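The fitted curve can be evaluated numerically with the reported parameters. The sketch assumes the four-parameter logistic form SRT(STI) = A2 + (A1 - A2) / (1 + (STI/x0)^p), which is consistent with the parameter names and values given in the text; the function name `srt_from_sti` is hypothetical.

```python
import numpy as np

# Reported parameters of the logistic fit (A1: upper plateau, A2: lower plateau, dB).
A1, A2, X0, P = 6.32, -5.79, 0.44, 16.68


def srt_from_sti(sti):
    """SRT in dB predicted from STI by the assumed four-parameter logistic form."""
    sti = np.asarray(sti, dtype=float)
    return A2 + (A1 - A2) / (1.0 + (sti / X0) ** P)


mid = float(srt_from_sti(X0))                            # midpoint between the plateaus
drop = float(srt_from_sti(0.38) - srt_from_sti(0.48))    # SRT change over 0.1 STI

print(round(mid, 3), round(drop, 2))
print(np.round(srt_from_sti([0.30, 0.38, 0.44, 0.48, 0.55, 0.60]), 2))
```

Under this assumed form the SRT falls by between 8 and 9 dB when the STI rises from 0.38 to 0.48, and the curve flattens towards A2 for STI values above about 0.55.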

As can be seen from Fig. 8, sentence intelligibility determined by the SRT is a nonlinear function of STI. The most sensitive range of SRT relative to STI (the steepest slope of the psychometric function) covers the range of STI changes from 0.38 to 0.48 (about 8 dB SRT/0.1 STI). Below and above this range, the SRT is less and less sensitive to STI changes; far outside this range it is practically independent of STI, and it reaches its minimum at about 0.55. This is because for this STI value the SRT reaches the lowest possible value, which is equal to the one obtained for anechoic conditions. This statement is the new and main finding of the present study. Again it must be stated that only reverberant conditions (with no additional noise in the enclosure) were tested, thus the hearing-in-noise test (HINT) used here gives insight into the influence of reverberant conditions and the amplification system on intelligibility expressed in terms of SRT. Moreover, such a test is very sensitive to any change in conditions (like reverberation or amplification), and thus gives very reliable data on the relationship between STI and SRT. However, for STI values over 0.6 no dependency between these quantities will be found, because for anechoic conditions (which can be regarded as the most “sterile” ones, with no convolutive distortions at all, only the additive distortion of the masking noise) the SRT values reach about -6.5 dB, which is obtained here for STI values of about 0.55.

Fig. 8. SRT vs. STI and logistic curve fitting for SRT values averaged across subjects.
To assess the relative change in SRT caused by reverberation, the difference ΔSRT between the SRT obtained for the non-reverberant condition (SRT_an) and for the reverberant condition (SRT_rev) was calculated for each measurement point. A zero value of this difference (ΔSRT = SRT_an - SRT_rev) means that speech intelligibility was not changed by the reverberant condition, while positive values mean that speech intelligibility increased and negative values mean that speech intelligibility decreased under the reverberant condition (see Fig. 9). Calculation of ΔSRT makes it possible to estimate the effect of the sound amplification system on speech intelligibility in reverberant listening conditions. As shown in Fig. 9, for the omnidirectional source a significant decrease in speech intelligibility can be noticed, while for the sound amplification system speech intelligibility is generally less sensitive to the reverberant condition. The results of speech intelligibility obtained in the tested enclosure with a sound amplification system showed that it could be significantly improved (on average 30%). This is in line with the known principle that a well-designed sound amplification system should be used to improve speech intelligibility in a room with a long reverberation time. It was shown in the current study that the SRT is a reasonably good indicator which quantifies sentence intelligibility in reverberant conditions well. The differences in SRT between measurement conditions (non-reverberant and reverberant) were statistically significant and depended on the location of the measurement point in the enclosure.

Fig. 9. Relative SRT changes for nine measurement points.

The significant decrease in speech intelligibility in reverberant conditions relative to intelligibility in the non-reverberant condition (in noise only) indicates that listening to speech in a reverberant environment is more difficult and requires higher cognitive abilities than listening to speech in noise, which is mainly governed by the SNR and the auditory profile. Thus, in future research, the SRT may be applied to investigate the importance of cognitive and temporal processing in speech performance in reverberant listening conditions.


5. Conclusions 


This study allows the following conclusions to be drawn:

• Presentation of the sentence test against babble noise in a room is a more reliable method of speech intelligibility measurement than the logatome test, especially for high-reverberation conditions.

• The SRT method used here seems to be more sensitive to changes in the acoustic conditions of a room with a long reverberation time, especially to changes in the early/late energy ratio.

• For a room with a long reverberation time and rich architecture, the logatome vs. STI relationship can be modeled by a linear function, but sentence intelligibility expressed in terms of SRT vs. STI should be modeled using a psychometric function.

• The most sensitive range of the SRT relative to STI changes corresponds to the middle range of STI values (around 0.35-0.5). Below and above this range, sentence intelligibility expressed in terms of SRT is much less dependent on STI changes (and far beyond this range it is practically independent of STI changes).

• The difference ΔSRT between SRT_anechoic and SRT_reverberant seems to be a good measure of the effect of room reverberation on speech intelligibility.